ApproxSeek: Web Document Search Using Approximate Matching

Author

Wen-Chen Hu

Abstract

Conventional search engines find Web pages by keyword matching, which has two disadvantages: (i) the order of keywords is ignored; and (ii) approximate queries submitted by users are beyond what keyword matching can handle. Keyword matching alone is therefore not satisfactory for Web page search. This paper proposes a Web search system, named ApproxSeek, to facilitate Web information retrieval. ApproxSeek achieves better search results by comparing longest approximate common subsequences, which are modified longest common subsequences with fault-tolerance capability. Experimental data show that the proposed method improves Web search, though more tests are needed to support this conclusion.
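The abstract does not spell out how a longest approximate common subsequence is computed, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It runs the standard longest-common-subsequence dynamic program over word tokens, which makes the score sensitive to keyword order, and swaps exact token equality for a tolerant comparison. The function names, the use of `difflib.SequenceMatcher`, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def tokens_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical fault-tolerant token test (an assumption, not from the paper).

    Two tokens count as equal when their similarity ratio reaches the threshold,
    so small misspellings still match.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def approx_lcs_length(query: list[str], doc: list[str], threshold: float = 0.8) -> int:
    """Length of the longest common subsequence of two token lists,
    using tokens_match instead of exact equality."""
    rows, cols = len(query), len(doc)
    # table[i][j] holds the result for query[:i] and doc[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if tokens_match(query[i - 1], doc[j - 1], threshold):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]


if __name__ == "__main__":
    query = ["web", "document", "search"]
    misspelled = ["web", "documnet", "serach", "tools"]  # same order, with typos
    reordered = ["search", "document", "web"]            # same words, reversed
    print(approx_lcs_length(query, misspelled))
    print(approx_lcs_length(query, reordered))
```

In this sketch a page that keeps the query's word order scores higher than one containing the same words in reverse, even when the in-order page contains misspellings, which is the behavior plain keyword matching cannot provide.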

Similar resources

Penerapan E-Service Berbasis Android pada Divisi Pelayanan Perbaikan Komputer CV Ria Kencana Ungu (RKU)

Archival information systems in government agencies are among the most used applications for daily activities. One feature of a document management information application is search, which retrieves documents from the available collection based on keywords entered by the user. However, some research on search engines has concluded that the average user error...

Sistem Informasi Pengarsipan Menggunakan Algoritma Levensthein String pada Kecamatan Seberang Ulu II

Term-specific eigenvector-centrality in multi-relation networks

Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, e...

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
